Add @helix-db/migrate tool for Supabase to HelixDB migration#862
Conversation
Adds @helix-db/migrate, a TypeScript CLI tool that automates migrating Supabase projects to HelixDB. The tool introspects a Supabase Postgres schema, auto-generates HelixDB .hx schema/query files, exports data, and imports it into a running HelixDB instance via the HTTP API. https://claude.ai/code/session_019kCQvvethhN4B7APgabxFy
```ts
  schemas: string[]
): Promise<TableInfo[]> {
  // Get all tables in the specified schemas
  const schemaList = schemas.map((s) => `'${s}'`).join(",");
```
SQL injection vulnerability in schema list construction. The schemas array values are concatenated directly without parameterization.
```diff
- const schemaList = schemas.map((s) => `'${s}'`).join(",");
+ const schemaList = schemas.map((_, i) => `$${i + 1}`).join(",");
```
Then pass schemas as the second parameter to client.query() on line 79.
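A minimal sketch of the suggested fix, assuming node-postgres-style `$n` placeholders (the helper name is hypothetical, not from the PR):

```typescript
// Builds a "$1,$2,..." placeholder list so schema names travel as bind
// parameters instead of being concatenated into the SQL text.
function buildPlaceholders(values: string[]): string {
  return values.map((_, i) => `$${i + 1}`).join(",");
}

const schemas = ["public", "auth"];
const schemaList = buildPlaceholders(schemas); // "$1,$2"

// The query text then references only placeholders, and the values are
// passed separately, e.g. with node-postgres:
//   await client.query(
//     `SELECT table_name FROM information_schema.tables
//      WHERE table_schema IN (${schemaList})`,
//     schemas
//   );
console.log(schemaList);
```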
```ts
  client: Client,
  schemas: string[]
): Promise<Record<string, string[]>> {
  const schemaList = schemas.map((s) => `'${s}'`).join(",");
```
Same SQL injection vulnerability in enum query.
```diff
- const schemaList = schemas.map((s) => `'${s}'`).join(",");
+ const schemaList = schemas.map((_, i) => `$${i + 1}`).join(",");
```
Then pass schemas as parameter array to the query.
```ts
// For small tables, just SELECT all
if (table.rowCount <= batchSize) {
  const result = await client.query(
    `SELECT ${columnNames} FROM "${schema}"."${tableName}"`
```
SQL injection vulnerability with schema and table name interpolation.
```ts
while (true) {
  const result = await client.query(
    `SELECT ${columnNames} FROM "${schema}"."${tableName}" ORDER BY ${orderBy} LIMIT $1 OFFSET $2`,
```
SQL injection vulnerability with schema, table, and ORDER BY column interpolation. Unlike values, identifiers cannot be passed as bind parameters, so they must be validated or escaped before being interpolated into the query.
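Since identifiers cannot be bound as `$n` parameters, one mitigation is to quote them in the spirit of Postgres's `quote_ident()`. This is a sketch, not the PR's code:

```typescript
// Minimal identifier quoting: doubling any embedded `"` makes the name
// safe to interpolate inside a "..."-quoted identifier. Helper name and
// validation rules are illustrative, not from the PR.
function quoteIdent(name: string): string {
  if (name.length === 0 || name.includes("\0")) {
    throw new Error(`invalid identifier: ${JSON.stringify(name)}`);
  }
  return `"${name.replace(/"/g, '""')}"`;
}

// A malicious table name can no longer break out of the quoted identifier:
const table = `users"; DROP TABLE users; --`;
console.log(quoteIdent(table));
```

Stricter still, and arguably safer: check every schema/table/column name against the lists already returned by `information_schema` during introspection before building any SQL.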
Additional Comments (1)
Consider using …
- Added a README.md file for the migration tool, providing usage instructions and setup guides.
- Updated package.json to include additional files for packaging and added a prepack script to ensure the build process runs before packaging.
- Modified the importData function to accept an optional helixApiKey parameter, allowing for API key usage during data import.
- Updated various functions to utilize the helixApiKey when making API calls to HelixDB.
- Enhanced command-line options in index.ts to support helix-api-key and instance management features.
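A sketch of how the optional `helixApiKey` might be threaded into the HTTP calls. The header name `x-api-key` and the per-query endpoint shape are assumptions for illustration, not confirmed by the PR:

```typescript
// Builds request headers, attaching the API key only when one is provided.
// Header name is an assumption, not from the PR.
function buildHeaders(helixApiKey?: string): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
  };
  if (helixApiKey) {
    headers["x-api-key"] = helixApiKey;
  }
  return headers;
}

// Usage inside an import call (hypothetical endpoint name):
//   await fetch(`${helixUrl}/ImportUserNode`, {
//     method: "POST",
//     headers: buildHeaders(helixApiKey),
//     body: JSON.stringify(row),
//   });
```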
Description
This PR introduces @helix-db/migrate, a comprehensive white-glove migration tool for moving projects from Supabase (PostgreSQL) to HelixDB. The tool automates the entire migration pipeline in five phases.

Key Features

--introspect-only, --import-only, and full migration workflows

Implementation Details
Related Issues
Closes #
Checklist when merging to main
Additional Notes
The migration tool is designed to be user-friendly with:
The tool handles edge cases like:
https://claude.ai/code/session_019kCQvvethhN4B7APgabxFy
Greptile Overview
Greptile Summary
This PR introduces @helix-db/migrate, a comprehensive CLI tool for migrating Supabase (PostgreSQL) projects to HelixDB. The tool automates schema introspection, conversion, data export, and import in five phases.

Key changes:
Critical security issues found:
- introspect.ts (lines 77, 279), where schema names are concatenated directly into queries without parameterization
- export-data.ts (lines 82, 105, 267), where schema/table names are interpolated without proper escaping

Other observations:
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant CLI as CLI (index.ts)
    participant Intro as Introspect (introspect.ts)
    participant GenSchema as Generate Schema (generate-schema.ts)
    participant GenQueries as Generate Queries (generate-queries.ts)
    participant Export as Export Data (export-data.ts)
    participant Import as Import Data (import-data.ts)
    participant PG as PostgreSQL/Supabase
    participant Helix as HelixDB Instance

    User->>CLI: Run helix-migrate supabase
    CLI->>User: Prompt for connection string
    User->>CLI: Provide connection string

    Note over CLI,Intro: Phase 1: Introspection
    CLI->>Intro: introspectDatabase(connectionString, schemas)
    Intro->>PG: Query information_schema (tables, columns, PKs, FKs, indexes)
    PG-->>Intro: Schema metadata
    Intro->>PG: Query pg_catalog (enums, row counts)
    PG-->>Intro: Additional metadata
    Intro-->>CLI: SchemaIntrospection

    Note over CLI,GenQueries: Phase 2: Schema Generation
    CLI->>GenSchema: generateSchema(introspection)
    GenSchema->>GenSchema: Convert tables to Nodes
    GenSchema->>GenSchema: Convert FKs to Edges
    GenSchema->>GenSchema: Extract vector columns to Vectors
    GenSchema-->>CLI: GeneratedSchema (.hx files)
    CLI->>GenQueries: generateQueries(schema)
    GenQueries-->>CLI: CRUD and Import queries

    Note over CLI: Phase 3: Write Project Files
    CLI->>CLI: Write helix.toml, schema.hx, queries.hx, import.hx, MIGRATION_GUIDE.md
    CLI->>User: Prompt to export data
    User->>CLI: Confirm export

    Note over CLI,Export: Phase 4: Data Export
    CLI->>Export: exportData(connectionString, tables, outputDir)
    loop For each table
        Export->>PG: SELECT with pagination (OFFSET/LIMIT)
        PG-->>Export: Batch of rows
        Export->>Export: Transform and serialize (JSON, arrays, vectors)
        Export->>Export: Write to JSON file
    end
    Export-->>CLI: Export results
    CLI->>User: Prompt to import data into HelixDB
    User->>CLI: Confirm import

    Note over CLI,Import: Phase 5: Data Import
    CLI->>Import: importData(helixUrl, exportDir, schema, tables)
    Import->>Import: Topological sort nodes by FK dependencies
    loop For each node (sorted)
        Import->>Import: Read exported JSON file
        loop For each row (batched)
            Import->>Helix: POST /{ImportNodeQuery} with data
            Helix-->>Import: New HelixDB ID
            Import->>Import: Map old PK to new ID
        end
    end
    loop For each edge
        Import->>Import: Read source table JSON
        loop For each row with FK
            Import->>Import: Lookup from_id and to_id via ID mapping
            Import->>Helix: POST /{ImportEdgeQuery} with from_id, to_id
            Helix-->>Import: Edge created
        end
    end
    loop For each vector
        Import->>Import: Read exported JSON file
        loop For each row with vector
            Import->>Helix: POST /{ImportVectorQuery} with vector data
            Helix-->>Import: Vector created
        end
    end
    Import-->>CLI: Import results (counts, errors, ID mapping)
    CLI->>User: Migration complete! Display next steps
```

Last reviewed commit: e00872f
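The import phase above orders tables so that every foreign-key target is imported before the tables that reference it. A minimal sketch of that topological sort, with hypothetical types and names (not the PR's actual code):

```typescript
// Each table lists the tables it references via foreign keys.
interface TableDep {
  name: string;
  referencedTables: string[];
}

// Depth-first topological sort: a table is emitted only after all of its
// FK targets. A cycle (self-references, mutual FKs) is broken by importing
// the in-progress table anyway rather than recursing forever.
function topoSort(tables: TableDep[]): string[] {
  const byName = new Map(tables.map((t) => [t.name, t]));
  const visited = new Set<string>();
  const inProgress = new Set<string>();
  const order: string[] = [];

  function visit(name: string): void {
    if (visited.has(name)) return;
    if (inProgress.has(name)) return; // FK cycle: break it
    inProgress.add(name);
    for (const dep of byName.get(name)?.referencedTables ?? []) {
      if (byName.has(dep)) visit(dep);
    }
    inProgress.delete(name);
    visited.add(name);
    order.push(name);
  }

  for (const t of tables) visit(t.name);
  return order;
}

const order = topoSort([
  { name: "posts", referencedTables: ["users"] },
  { name: "comments", referencedTables: ["posts", "users"] },
  { name: "users", referencedTables: [] },
]);
console.log(order); // users before posts before comments
```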